Overview
The fetch_all_ohlcv.py script retrieves historical daily OHLCV (candlestick) data for all stocks. It performs incremental updates by detecting existing data and fetching only the missing days, and it merges a live snapshot so that today’s intraday data is included.
Purpose
Fetches historical price and volume data:
- Open, High, Low, Close prices (daily candles)
- Volume (daily trading volume)
- Incremental Updates - Only fetches missing dates
- Live Integration - Merges today’s live data with historical data
- Chunked Fetching - Handles large date ranges in 180-day chunks
API Endpoints
Historical OHLCV Endpoint
POST https://openweb-ticks.dhan.co/getDataH
Live Snapshot Endpoint
POST https://ow-scanx-analytics.dhan.co/customscan/fetchdt
Request Payloads
Historical Data Request
Parameters
Exchange code (NSE for National Stock Exchange)
Stock symbol (e.g., “RELIANCE”)
Market segment (E for Equity)
Instrument type
Security ID (Sid) from master_isin_map.json
Expiry code (0 for equity stocks)
Candle interval:
- D - Daily candles (most common)
- W - Weekly candles
- M - Monthly candles
- 1, 5, 15, 30, 60 - Intraday intervals in minutes
Start timestamp (Unix epoch). The script uses 215634600 (Oct 31, 1976) to fetch the maximum available history.
End timestamp (Unix epoch). Typically the current timestamp.
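As a quick check of the timestamp values above, the following sketch converts the start timestamp to a calendar date and computes an end timestamp for the current moment. The variable names are illustrative and are not taken from the script.

```python
from datetime import datetime, timezone
import time

# Start timestamp quoted in the parameter description above.
START_TS = 215634600

# Interpret the epoch value in UTC to see which calendar day it denotes.
start_utc = datetime.fromtimestamp(START_TS, tz=timezone.utc)
print(start_utc.isoformat())  # 1976-10-31T18:30:00+00:00

# End timestamp: the current time as whole seconds since the Unix epoch.
end_ts = int(time.time())
print(end_ts > START_TS)  # True
```

18:30 UTC on Oct 31, 1976 is midnight on Nov 1, 1976 in Indian Standard Time (UTC+5:30).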
Live Snapshot Request
Output Files
Per-stock OHLCV data in CSV format:
- Date: YYYY-MM-DD format
- Open: Opening price
- High: Intraday high
- Low: Intraday low
- Close: Closing price (or LTP for today)
- Volume: Trading volume
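A minimal sketch of writing and reading a file in this layout with the standard csv module. The column names come from the list above; the helper names write_ohlcv_csv and read_last_date, the sample rows, and the temporary path are hypothetical and may differ from what the script does.

```python
import csv
import os
import tempfile

FIELDS = ["Date", "Open", "High", "Low", "Close", "Volume"]

def write_ohlcv_csv(path, rows):
    """Write rows (dicts keyed by FIELDS) with a header line."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)

def read_last_date(path):
    """Return the latest Date value in the file, or None if it has no rows."""
    with open(path, newline="") as f:
        dates = [row["Date"] for row in csv.DictReader(f)]
    # YYYY-MM-DD strings sort chronologically, so max() is the latest date.
    return max(dates) if dates else None

# Made-up sample values, for illustration only.
sample = [
    {"Date": "2024-01-01", "Open": 100.0, "High": 105.0, "Low": 99.0,
     "Close": 104.0, "Volume": 12000},
    {"Date": "2024-01-02", "Open": 104.0, "High": 106.5, "Low": 103.0,
     "Close": 105.5, "Volume": 9800},
]
path = os.path.join(tempfile.mkdtemp(), "EXAMPLE.csv")
write_ohlcv_csv(path, sample)
last_date = read_last_date(path)
```

Storing dates as YYYY-MM-DD keeps string order and chronological order identical, which makes the "read last date" step a plain comparison.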
Function Signatures
Main Fetching Function
Live Snapshot Function
Dependencies
- requests - HTTP client
- json - JSON processing
- os - File operations
- time - Timestamp calculations
- csv - CSV file reading/writing
- datetime - Date parsing and formatting
- concurrent.futures.ThreadPoolExecutor - Parallel execution
- pipeline_utils.BASE_DIR - Base directory path
- pipeline_utils.get_headers() - API headers with Origin header
- dhan_data_response.json - Raw market data with Sid field
Configuration
Number of days per API request chunk (180). Prevents timeouts on large date ranges.
Number of concurrent download threads (15).
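To illustrate the chunk setting, here is one possible way to split a date range into windows of at most 180 days. The constant name CHUNK_DAYS, the function split_into_chunks, and the choice of inclusive boundaries are assumptions; the script may compute its chunks differently.

```python
from datetime import date, timedelta

CHUNK_DAYS = 180  # hypothetical constant name; the value is the documented chunk size

def split_into_chunks(start, end, chunk_days=CHUNK_DAYS):
    """Split the inclusive range [start, end] into consecutive windows
    of at most chunk_days days each."""
    chunks = []
    cursor = start
    while cursor <= end:
        chunk_end = min(cursor + timedelta(days=chunk_days - 1), end)
        chunks.append((cursor, chunk_end))
        cursor = chunk_end + timedelta(days=1)
    return chunks

chunks = split_into_chunks(date(2023, 1, 1), date(2023, 12, 31))
for chunk_start, chunk_end in chunks:
    print(chunk_start, chunk_end)
```

Each window would then map to one API request, with the window boundaries converted to Unix timestamps for the start and end parameters.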
Code Example
Usage
Performance
- Execution Time:
  - First run (full 2-year history): ~25-30 minutes for 2,775 stocks
  - Incremental updates: ~2-5 minutes (only fetches missing days)
- API Calls: Variable (depends on missing date ranges)
- Output: 2,775 CSV files in ohlcv_data/ directory
- Concurrency: 15 parallel threads
- Chunk Size: 180-day chunks to prevent timeouts
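The concurrency figure above can be sketched with concurrent.futures.ThreadPoolExecutor. This is an illustration only: fetch_one is a placeholder that performs no network call, the "Symbol" key and the sample records are made up, and MAX_WORKERS is a hypothetical name for the documented thread count.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_WORKERS = 15  # documented thread count; the constant name is hypothetical

def fetch_one(stock):
    """Placeholder for the per-stock download; returns (symbol, row_count)."""
    # A real implementation would call the historical endpoint here.
    return stock["Symbol"], 0

# Made-up records; the "Symbol" key is an assumption, "Sid" is named in this document.
stocks = [
    {"Symbol": "AAA", "Sid": 1},
    {"Symbol": "BBB", "Sid": None},  # no Sid, so it is skipped
    {"Symbol": "CCC", "Sid": 3},
]
eligible = [s for s in stocks if s.get("Sid")]

results = {}
with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    futures = {pool.submit(fetch_one, s): s["Symbol"] for s in eligible}
    for future in as_completed(futures):
        symbol = futures[future]
        try:
            _, results[symbol] = future.result()
        except Exception as exc:  # one failure should not stop the other downloads
            results[symbol] = exc
```

Threads suit this workload because each task spends most of its time waiting on HTTP responses rather than using the CPU.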
Incremental Update Logic
- Check Existing File: If {SYMBOL}.csv exists, read last date
- Calculate Gap: Determine missing dates from last date to today
- Chunk Fetching: Download missing data in 180-day chunks
- Merge: Combine existing + new data, deduplicate by date
- Live Integration: Append today’s live snapshot
- Save: Overwrite CSV with complete dataset
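The gap calculation and merge steps above can be sketched as follows. The function names, the sample rows, and the assumption that fetching resumes on the day after the last stored date are illustrative; the script's exact behavior is not shown in this document.

```python
from datetime import date, timedelta

def missing_range(last_date, today):
    """Return (start, end) of the dates still to fetch, or None if up to date."""
    start = last_date + timedelta(days=1)
    return (start, today) if start <= today else None

def merge_rows(existing, new):
    """Combine two row lists, keeping one row per Date; rows from new win."""
    by_date = {row["Date"]: row for row in existing}
    by_date.update({row["Date"]: row for row in new})
    return [by_date[d] for d in sorted(by_date)]

# Made-up rows, trimmed to two columns for brevity.
existing = [
    {"Date": "2024-01-01", "Close": 104.0},
    {"Date": "2024-01-02", "Close": 105.5},
]
new = [
    {"Date": "2024-01-02", "Close": 105.6},
    {"Date": "2024-01-03", "Close": 107.0},
]

gap = missing_range(date(2024, 1, 2), date(2024, 1, 5))
merged = merge_rows(existing, new)
```

Keying rows by date makes deduplication a side effect of the merge: a date can appear only once in the dictionary, and the newer row replaces the older one.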
Hybrid Live Integration
Historical API typically lags by 1 day. The script solves this by:
- Fetching live snapshots for all stocks at script start
- Appending today’s {Open, High, Low, Ltp, Volume} to each stock’s data
- Using Ltp (Last Traded Price) as today’s Close
- Deduplicating by date during merge (live data overwrites if date exists)
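A minimal sketch of the live integration described above, assuming the snapshot for one stock is available as a dictionary with the fields Open, High, Low, Ltp, and Volume. The function names live_to_row and overlay_live and all sample values are hypothetical.

```python
from datetime import date

def live_to_row(snapshot, today):
    """Map a live snapshot to a daily row, using Ltp as the Close."""
    return {
        "Date": today.isoformat(),
        "Open": snapshot["Open"],
        "High": snapshot["High"],
        "Low": snapshot["Low"],
        "Close": snapshot["Ltp"],
        "Volume": snapshot["Volume"],
    }

def overlay_live(rows, live_row):
    """Replace any row for the same date with the live row, else append it."""
    kept = [r for r in rows if r["Date"] != live_row["Date"]]
    return sorted(kept + [live_row], key=lambda r: r["Date"])

# Made-up values, for illustration only.
snapshot = {"Open": 107.0, "High": 109.0, "Low": 106.5, "Ltp": 108.2, "Volume": 4300}
history = [
    {"Date": "2024-01-04", "Open": 105.0, "High": 107.5, "Low": 104.0,
     "Close": 107.0, "Volume": 9100},
]
today_row = live_to_row(snapshot, date(2024, 1, 5))
combined = overlay_live(history, today_row)
```

Applying overlay_live a second time with a fresher snapshot for the same date replaces the earlier row instead of adding a duplicate.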
Notes
- Requires Sid: Stocks without Sid are skipped
- Smart Updates: Only fetches missing date ranges (not full history on every run)
- Maximum History: Uses start timestamp 215634600 (1976) to force API to return all available data
- Date Deduplication: Prevents duplicate dates when merging existing + new data
- CSV Format: Standard CSV with headers for easy import into analysis tools
- Default Lookback: 2 years (about 500 trading days), enough for technical indicator calculations such as the 200-day moving average